.\" Exonerate man Page
.TH exonerate-server 1 "January 2008" exonerate-server "sequence comparison server"
.SH NAME
.\"
exonerate-server \- a sequence comparison server for exonerate
.\"

.SH SYNOPSIS
.B exonerate-server [ options ] <index path>
.\"

.SH DESCRIPTION
.BR exonerate-server
is a server for the exonerate sequence alignment program.

It uses a set of sequences and a corresponding index file
to allow fast searching of large datasets.
.\"

.RE
.SH OVERVIEW
.T
.\"
Firstly, an
.B .esd
file must be made from the sequence files.
The
.B .esd
file is an Exonerate Sequence Dataset
file, and can be used to group together any set
of sequences, provided that each sequence has a unique identifier.
This is done by using the
.B fasta2esd
utility.
.RS
.TP
.B "fasta2esd genome.fasta genome.esd"
.P
.RE
Next, an
.B .esi
file must be made from the
.B .esd
file.
The
.B .esi
file is an Exonerate Sequence Index
file, and contains an index or set of indices corresponding
to a particular dataset.
This is done by using the
.B esd2esi
utility.
.\"
.RS
.TP
.B "esd2esi genome.esd genome.esi"
.P
.RE
Once the
.B .esi
file has been generated, the exonerate-server may be started.
.RS
.TP
.B "exonerate-server genome.esi"
.P
.RE
While the server is running, exonerate may be used to query
the server by replacing the target sequences in the command line
with the name of the server and port number.
The default port number for the exonerate-server is 12886.
.RS
.TP
.B "exonerate query.fasta localhost:12886"
.P
.RE

.RE
.SH OPTIONS
.T
Some of the command line options for the exonerate-server are the same as for the
exonerate client, and these are documented in the man page for
.B exonerate.
The other options which are specific to
.B exonerate-server
are documented here.
.TP
.B "\--port" <port>
Specify the port on which the server should listen.
By default,
.B exonerate-server
will listen on port 12886, but alternative ports may be specified with this
option.
.\"
.TP
.B "\--input" <index file>
Specify the index file to be used when the server is started.
This option is mandatory.  The index file is a
.B .esi
file generated by the
.B esd2esi
utility.
.\"
.TP
.B "\--preload" <boolean>
By default the indices contained in the
.B .esi
file, and the sequences referenced in the corresponding
.B .esd
file are loaded into memory when the server is started.
This is necessary to achieve fast performance
that would otherwise be hampered by frequent disk accesses.
This option allows the index and sequence preloading to be turned
off, which makes the server run much more slowly,
but with faster startup and a smaller memory footprint.
It is not advised to turn preloading off unless testing or debugging
the server.
.\"
.TP
.B "\--maxconnections" <count>
Set the number of client processes
which are allowed to connect to the server simultaneously.
For good performance, it should not be set to more than the number of CPUs
on the machine on which the server is running.
.\"
.TP
.B "\--verbosity" <level>
Set the verbosity level for the server.  If it is zero, the server
will be silent, and the higher the number, the more messages
are reported by the server about what is happening.
.\"

.RE
.SH INTERFACE
.T
This section documents the communication interface between
the client and server.
The interface is documented for people wishing to write their own
custom server to sit behind exonerate;
for normal use of exonerate, it is not necessary to know this.
.\"
.P
The interface works by the client sending simple command lines
and the server sending simple reply lines over a socket.
All the commands and replies are simple lines of ASCII text,
so it is possible to use telnet as a client for testing a server.
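.P
Because commands and replies are plain lines of text, a scripted client
can stand in for telnet when testing a server.
The following Python sketch is illustrative only and is not part of the
exonerate package: the helper names open_session and ask are hypothetical,
and it assumes that each command is terminated by a single line feed
and that the reply fits on one line.
.P
.nf

```python
import socket


def open_session(host="localhost", port=12886):
    """Connect to a server and return (socket, text stream)."""
    sock = socket.create_connection((host, port))
    # newline="" stops Python from translating line endings.
    return sock, sock.makefile("rw", encoding="ascii", newline="")


def ask(stream, command):
    """Send one command line and return one reply line, stripped."""
    print(command, file=stream, flush=True)  # print() appends the line feed
    return stream.readline().rstrip()
```

.fi
.P
With a server listening on the default port, calling ask with the
version command would be expected to return a single version: line.
Multiline replies need additional handling of the linecount: line.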
.\"
.P
Any command is a single line of text, but a reply may contain
many lines of text.  The replies are in the form of
.B <tag>: <message>
.\"
.P
Any reply can include lines with the tag
.B warning:
or
.BR error: .
These
.B warning:
and
.B error:
tags are echoed by the client, and the client will
exit after receiving any
.B error:
reply.
.P
.\"
When the server is returning a multiline reply, the first
line must show the number of lines in the whole reply,
including the linecount line itself, as:
.B linecount: <count>
For examples, see the replies from the
.B "get hsps"
commands in the example session below.
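.P
The framing rules above can be sketched as a small reader.
This is an illustrative Python sketch rather than code from the exonerate
package: read_reply, split_line and ServerError are hypothetical names,
the stream is any text stream with a readline() method,
and it assumes that warning: lines are counted by linecount: like any
other line, which is not stated here.
.P
.nf

```python
class ServerError(Exception):
    """Raised when a reply contains an error: line."""


def split_line(line):
    """Split one "<tag>: <message>" line into (tag, message)."""
    tag, _, message = line.rstrip().partition(":")
    return tag.strip(), message.strip()


def read_reply(stream):
    """Read one complete reply; return (lines, warnings).

    lines is a list of (tag, message) pairs without the warnings.
    """
    first = split_line(stream.readline())
    if first[0] == "linecount":
        # The count covers the whole reply, linecount: line included.
        remaining = int(first[1]) - 1
        entries = [split_line(stream.readline()) for _ in range(remaining)]
    else:
        entries = [first]
    for tag, message in entries:
        if tag == "error":
            raise ServerError(message)
    warnings = [message for tag, message in entries if tag == "warning"]
    lines = [(tag, message) for tag, message in entries if tag != "warning"]
    return lines, warnings
```

.fi
.P
Raising an exception on error: mirrors the client behaviour of exiting
after any error: reply, while warnings are returned so that the caller
can echo them.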
.P
.\"
The client will only open a single connection to any server.
A server will accept as many simultaneous client connections as is specified
by the --maxconnections option or the
EXONERATE_EXONERATE_SERVER_MAXCONNECTIONS environment variable.
.\"
.P
.SS Commands and replies used in the interface.
.P
.PP
.PD 0
.TP 10
Command:
.B version
.TP 10
Reply:
version: <server name> <server version>

.TP 10
Command:
.B exit
.TP 10
Reply:
( no reply - server closes connection )

.TP 10
Command:
.B dbinfo
.TP 10
Reply:
dbinfo: <type> <masked> <num_seqs> <max_seq_len> <total_seq_len>

The
.B dbinfo
command returns information about the database loaded on the server.
The returned fields are:

.RS
.PD 0
.TP 16
.B <type>
either dna or protein
.TP 16
.B <masked>
either softmasked or unmasked
.TP 16
.B <num_seqs>
the number of sequences in the database
.TP 16
.B <max_seq_len>
the length of the longest sequence in the database
.TP 16
.B <total_seq_len>
the total length of all the sequences in the database
.RE

.TP 10
Command:
.B lookup
<eid>
.TP 10
Reply:
lookup: <iid>

The lookup command is used to map an external identifier to an internal
identifier.

.TP 10
Command:
.B get info
<iid>
.TP 10
Reply:
seqinfo: <len> <checksum> <eid> [ <def> ]

The get info command returns information about a sequence in the database.
The returned fields are:

.RS
.TP 16
.B <len>
the sequence length
.TP 16
.B <checksum>
a gcg format checksum (see below)
.TP 16
.B <eid>
the external id (e.g. from the fasta header)
.TP 16
.B <def>
a description line for the sequence (also from the fasta header);
this field is optional and may be omitted.
.RE

.TP 10
Command:
.B get seq
<iid>
.TP 10
Reply:
seq: <seq>

The get seq command returns a whole sequence on one line.

.TP 10
Command:
.B get subseq
<iid> <start> <len>
.TP 10
Reply:
subseq: <sequence>

The get subseq command returns part of a sequence.
The start of the sequence is position zero.
For example,
.B get subseq 0 0 10
will return the first 10 bases of the first sequence in the database.

.TP 10
Command:
.B set query
<seq>
.TP 10
Reply:
ok: <len> <checksum>

The set query command is used to send a query sequence to the server.
It returns the length of the sequence and a gcg checksum.

.TP 10
Command:
.B revcomp
<query | target>
.TP 10
Reply:
ok: <query | target> strand <forward | revcomp>

The revcomp query command makes the server
reverse complement the query.  This is to save the bandwidth
of sending the query twice.

The revcomp target command is to tell the server
to treat the database as its reverse complement.
The client only sends this command when searching a translated
database, so it need not be implemented for most types of search.

.TP 10
Command:
.B set param
<name> <value>
.TP 10
Reply:
ok: <set | ignored>

The set param command sends parameters from the exonerate command line
to the server.  These commands can all be ignored by the server in a basic
implementation, but should be honoured for optimal performance.

.TP 10
Command:
.B get hsps
.TP 10
Reply:
hspset: <iid> { <query_pos> <target_pos> <length> }
.TP 10
Or:
hspset: empty

The get hsps command is the main command for getting sets of hsps.
The server may return multiple hspsets.
The returned fields are:

.RS
.TP 16
.B <iid>
The internal id of the target sequence for this hspset.
.TP 16
.B <query_pos>
The hsp query start position
.TP 16
.B <target_pos>
The hsp target start position
.TP 16
.B <length>
The hsp length
.RE

.RS
The last three fields represent an HSP, and may be repeated many times
on one
.B hspset:
reply line.
.RE
.PD
.PP
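.P
An hspset: message can be unpacked into an internal id followed by
HSP triples.
The Python sketch below is illustrative only; Hsp and parse_hspset are
hypothetical names, and the input is assumed to be the text that follows
the hspset: tag on one reply line.
.P
.nf

```python
from typing import NamedTuple


class Hsp(NamedTuple):
    query_pos: int
    target_pos: int
    length: int


def parse_hspset(message):
    """Return None for "empty", otherwise (iid, list of Hsp)."""
    fields = message.split()
    if fields == ["empty"]:
        return None
    iid = int(fields[0])
    numbers = [int(field) for field in fields[1:]]
    if not numbers or len(numbers) % 3 != 0:
        raise ValueError("expected <iid> followed by whole HSP triples")
    # Each consecutive group of three numbers is one HSP.
    hsps = [Hsp(*numbers[i:i + 3]) for i in range(0, len(numbers), 3)]
    return iid, hsps
```

.fi
.P
For the message 61781 1 358 41 36 392 26 this yields the internal id 61781
with two HSPs, (1, 358, 41) and (36, 392, 26).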
.\"
.SS A simple example client server dialog.
.P
.PP
.nf
.sp

% telnet localhost 12886
Trying 127.0.0.1...
Connected to localhost.localdomain.
Escape character is '^]'.
% version
version: exonerate-server 2.0.0
% dbinfo
dbinfo: dna softmasked 100000 1701 38113579
% lookup AA159529.1
lookup: 88065
% get info 88065
seqinfo: 62 2028 AA159529.1 zo72g05.s1 Stratagene pancreas (#937208) Homo sapiens cDNA
% get seq 88065
seq: NAACTCATCNTTTTCTGCTGNATCCTCTTCACCAGTTTGGGGGANGGCCTGCACTTCCANAG
% get subseq 88065 10 20
subseq: TTTTCTGCTGNATCCTCTTC
% set query NAACTCATCNTTTTCTGCTGNATCCTCTTCACCAGTTTGGGGGANGGCCTGCACTTCCANAG
ok: 62 2028
% get hsps
linecount: 15
hspset: 12423 1 349 41
hspset: 44900 1 356 47
hspset: 61781 1 358 41 36 392 26
hspset: 70065 1 349 41 36 383 26
hspset: 88065 1 1 61
hspset: 91032 1 357 41 36 391 26
hspset: 91442 1 350 41 36 384 26
hspset: 92971 1 348 41 36 382 26
hspset: 94311 1 375 41
hspset: 95381 1 346 41 36 380 26
hspset: 96808 10 385 32 36 410 26
hspset: 88449 18 11 22
hspset: 91036 6 6 56
hspset: 93736 36 400 26
% revcomp query
ok: query strand revcomp
% get hsps
linecount: 6
hspset: 12564 0 64 26 20 83 41
hspset: 61780 0 266 61
hspset: 29148 0 116 61
hspset: 25849 15 445 22
hspset: 93938 26 265 34
% exit
Connection closed by foreign host.
.sp
.fi
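.P
The figures in the session above can be checked offline.
The Python sketch below is illustrative and is not taken from the exonerate
source: it assumes that the gcg checksum is the conventional GCG one
(each symbol code multiplied by a position weight cycling from 1 to 57,
summed modulo 10000) and that get subseq is a zero-based slice.
Under those assumptions it reproduces the ok: 62 2028 and subseq: replies
shown for sequence 88065.
.P
.nf

```python
def gcg_checksum(seq):
    """Conventional GCG checksum (assumed to be what the server reports)."""
    total = 0
    for i, symbol in enumerate(seq.upper()):
        total += (i % 57 + 1) * ord(symbol)  # weights cycle 1..57
    return total % 10000


def subseq(seq, start, length):
    """Zero-based slice, mirroring get subseq <iid> <start> <len>."""
    return seq[start:start + length]


SEQ = "NAACTCATCNTTTTCTGCTGNATCCTCTTCACCAGTTTGGGGGANGGCCTGCACTTCCANAG"
print(len(SEQ), gcg_checksum(SEQ))  # prints: 62 2028
print(subseq(SEQ, 10, 20))          # prints: TTTTCTGCTGNATCCTCTTC
```

.fi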

.\"
.SH ENVIRONMENT
Not documented yet.
.\"
.SH EXAMPLES
.\"
.B 1.
Example of creating a translated index
and running a fast protein2genome search using exonerate-server

.B fasta2esd
human.genomic.fasta human.genomic.esd
.br
.B esd2esi
--translate yes human.genomic.esd human.genomic.trans.esi
.br
.B exonerate-server
--port 1234 human.genomic.trans.esi
.br
.B exonerate
pep.fasta localhost:1234 --model p2g --seedrepeat 3 --geneseed 250

.\"
.RE
.SH VERSION
This documentation accompanies version 2.2.0 of the exonerate package.
.\"
.SH AUTHOR
Guy St.C. Slater.  <guy@ebi.ac.uk>.
.L
See the AUTHORS file accompanying the source code
for a list of contributors.
.SH AVAILABILITY
The source code for the exonerate package is available
under the terms of the GNU
.I general
public licence.

Please see the file COPYING which was distributed with this package,
or http://www.gnu.org/licenses/gpl.txt for details.

This package has been developed as part of the ensembl project.
Please see http://www.ensembl.org/ for more information.
.SH "SEE ALSO"
.BR exonerate (1)
